---
title: "Voice in Mastra | Voice"
description: Overview of voice capabilities in Mastra, including text-to-speech, speech-to-text, and real-time speech-to-speech interactions.
---

import { AudioPlayback } from "@site/src/components/AudioPlayback";
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

# Voice in Mastra

Mastra's Voice system provides a unified interface for voice interactions, enabling text-to-speech (TTS), speech-to-text (STT), and real-time speech-to-speech (STS) capabilities in your applications.

## Adding Voice to Agents

To learn how to integrate voice capabilities into your agents, check out the [Adding Voice to Agents](/docs/v1/agents/adding-voice) documentation. That guide covers how to use single and multiple voice providers, as well as real-time interactions.

```typescript
import { Agent } from "@mastra/core/agent";
import { OpenAIVoice } from "@mastra/voice-openai";

// Create an agent that uses OpenAI as its voice provider
const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIVoice(),
});
```

You can then use the following voice capabilities:

### Text to Speech (TTS)

Turn your agent's responses into natural-sounding speech using Mastra's TTS capabilities.
Choose from multiple providers like OpenAI, ElevenLabs, and more.

For detailed configuration options and advanced features, check out our [Text-to-Speech guide](./text-to-speech).

<Tabs>
  <TabItem value="openai" label="OpenAI">

```typescript
import { Agent } from "@mastra/core/agent";
import { OpenAIVoice } from "@mastra/voice-openai";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
  responseFormat: "wav", // Optional: specify a response format
});

playAudio(audioStream);
```

Visit the [OpenAI Voice Reference](/reference/v1/voice/openai) for more information on the OpenAI voice provider.

  </TabItem>
  <TabItem value="azure" label="Azure">

```typescript
import { Agent } from "@mastra/core/agent";
import { AzureVoice } from "@mastra/voice-azure";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new AzureVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "en-US-JennyNeural", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Azure Voice Reference](/reference/v1/voice/azure) for more information on the Azure voice provider.

  </TabItem>
  <TabItem value="elevenlabs" label="ElevenLabs">

```typescript
import { Agent } from "@mastra/core/agent";
import { ElevenLabsVoice } from "@mastra/voice-elevenlabs";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new ElevenLabsVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [ElevenLabs Voice Reference](/reference/v1/voice/elevenlabs) for more information on the ElevenLabs voice provider.

  </TabItem>
  <TabItem value="playai" label="PlayAI">

```typescript
import { Agent } from "@mastra/core/agent";
import { PlayAIVoice } from "@mastra/voice-playai";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new PlayAIVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [PlayAI Voice Reference](/reference/v1/voice/playai) for more information on the PlayAI voice provider.

  </TabItem>
  <TabItem value="google" label="Google">

```typescript
import { Agent } from "@mastra/core/agent";
import { GoogleVoice } from "@mastra/voice-google";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new GoogleVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "en-US-Studio-O", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Google Voice Reference](/reference/v1/voice/google) for more information on the Google voice provider.

  </TabItem>
  <TabItem value="cloudflare" label="Cloudflare">

```typescript
import { Agent } from "@mastra/core/agent";
import { CloudflareVoice } from "@mastra/voice-cloudflare";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new CloudflareVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Cloudflare Voice Reference](/reference/v1/voice/cloudflare) for more information on the Cloudflare voice provider.

  </TabItem>
  <TabItem value="deepgram" label="Deepgram">

```typescript
import { Agent } from "@mastra/core/agent";
import { DeepgramVoice } from "@mastra/voice-deepgram";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new DeepgramVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "aura-english-us", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Deepgram Voice Reference](/reference/v1/voice/deepgram) for more information on the Deepgram voice provider.

  </TabItem>
  <TabItem value="speechify" label="Speechify">

```typescript
import { Agent } from "@mastra/core/agent";
import { SpeechifyVoice } from "@mastra/voice-speechify";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new SpeechifyVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "matthew", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Speechify Voice Reference](/reference/v1/voice/speechify) for more information on the Speechify voice provider.

  </TabItem>
  <TabItem value="sarvam" label="Sarvam">

```typescript
import { Agent } from "@mastra/core/agent";
import { SarvamVoice } from "@mastra/voice-sarvam";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new SarvamVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Sarvam Voice Reference](/reference/v1/voice/sarvam) for more information on the Sarvam voice provider.

  </TabItem>
  <TabItem value="murf" label="Murf">

```typescript
import { Agent } from "@mastra/core/agent";
import { MurfVoice } from "@mastra/voice-murf";
import { playAudio } from "@mastra/node-audio";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new MurfVoice(),
});

const { text } = await voiceAgent.generate("What color is the sky?");

// Convert the text to speech; speak() returns an audio stream
const audioStream = await voiceAgent.voice.speak(text, {
  speaker: "default", // Optional: specify a speaker
});

playAudio(audioStream);
```

Visit the [Murf Voice Reference](/reference/v1/voice/murf) for more information on the Murf voice provider.

  </TabItem>
</Tabs>
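
If you would rather save the synthesized audio than play it, you can drain the stream into a buffer first. The sketch below assumes the value returned by `speak()` is a Node.js `Readable`; `collectAudio` is a hypothetical helper name and is not part of Mastra.

```typescript
import { Readable } from "node:stream";

// Hypothetical helper (not a Mastra API): drain an audio stream into one Buffer.
async function collectAudio(stream: Readable): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    // Chunks may arrive as Buffers or strings depending on the stream's mode.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}
```

With the buffer in hand, a call such as `await writeFile("reply.wav", await collectAudio(audioStream))`, using `writeFile` from `node:fs/promises`, would persist it. This assumes the provider actually returned WAV data.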

### Speech to Text (STT)

Transcribe spoken audio into text using providers such as OpenAI, ElevenLabs, and others. For detailed configuration options, check out [Speech to Text](./speech-to-text).

Download the [sample audio file](https://github.com/mastra-ai/realtime-voice-demo/raw/refs/heads/main/how_can_i_help_you.mp3) used in the examples below.

<br />
<AudioPlayback audio="https://github.com/mastra-ai/realtime-voice-demo/raw/refs/heads/main/how_can_i_help_you.mp3" />

<Tabs>
  <TabItem value="openai" label="OpenAI">

```typescript
import { Agent } from "@mastra/core/agent";
import { OpenAIVoice } from "@mastra/voice-openai";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIVoice(),
});

// Read the sample audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [OpenAI Voice Reference](/reference/v1/voice/openai) for more information on the OpenAI voice provider.

  </TabItem>
  <TabItem value="azure" label="Azure">

```typescript
import { Agent } from "@mastra/core/agent";
import { AzureVoice } from "@mastra/voice-azure";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new AzureVoice(),
});

// Read the sample audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Azure Voice Reference](/reference/v1/voice/azure) for more information on the Azure voice provider.

  </TabItem>
  <TabItem value="elevenlabs" label="ElevenLabs">

```typescript
import { Agent } from "@mastra/core/agent";
import { ElevenLabsVoice } from "@mastra/voice-elevenlabs";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new ElevenLabsVoice(),
});

// Read the sample audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [ElevenLabs Voice Reference](/reference/v1/voice/elevenlabs) for more information on the ElevenLabs voice provider.

  </TabItem>
  <TabItem value="google" label="Google">

```typescript
import { Agent } from "@mastra/core/agent";
import { GoogleVoice } from "@mastra/voice-google";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new GoogleVoice(),
});

// Read the sample audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Google Voice Reference](/reference/v1/voice/google) for more information on the Google voice provider.

  </TabItem>
  <TabItem value="cloudflare" label="Cloudflare">

```typescript
import { Agent } from "@mastra/core/agent";
import { CloudflareVoice } from "@mastra/voice-cloudflare";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new CloudflareVoice(),
});

// Read the sample audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Cloudflare Voice Reference](/reference/v1/voice/cloudflare) for more information on the Cloudflare voice provider.

  </TabItem>
  <TabItem value="deepgram" label="Deepgram">

```typescript
import { Agent } from "@mastra/core/agent";
import { DeepgramVoice } from "@mastra/voice-deepgram";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new DeepgramVoice(),
});

// Read the sample audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Deepgram Voice Reference](/reference/v1/voice/deepgram) for more information on the Deepgram voice provider.

  </TabItem>
  <TabItem value="sarvam" label="Sarvam">

```typescript
import { Agent } from "@mastra/core/agent";
import { SarvamVoice } from "@mastra/voice-sarvam";
import { createReadStream } from "fs";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new SarvamVoice(),
});

// Read the sample audio file from disk
const audioStream = createReadStream("./how_can_i_help_you.mp3");

// Convert audio to text
const transcript = await voiceAgent.voice.listen(audioStream);
console.log(`User said: ${transcript}`);

// Generate a response based on the transcript
const { text } = await voiceAgent.generate(transcript);
```

Visit the [Sarvam Voice Reference](/reference/v1/voice/sarvam) for more information on the Sarvam voice provider.

  </TabItem>
</Tabs>
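
The examples above pass the transcript straight to `generate()`, which works when `listen()` resolves to a string. As an assumption worth checking against your provider's reference, some providers may resolve to a stream or to nothing instead. The following sketch normalizes those cases into a plain string; `transcriptToText` is a hypothetical helper, not a Mastra API.

```typescript
import { Readable } from "node:stream";

// Hypothetical helper (not a Mastra API): turn a listen() result into a string.
// Assumes the result is a string, a Node.js Readable of text, or undefined.
async function transcriptToText(
  result: string | Readable | undefined,
): Promise<string> {
  if (result === undefined) return "";
  if (typeof result === "string") return result;

  // Collect raw bytes first so multi-byte characters split across chunks decode correctly.
  const chunks: Buffer[] = [];
  for await (const chunk of result) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks).toString("utf8");
}
```

You could then write `const transcript = await transcriptToText(await voiceAgent.voice.listen(audioStream))` before calling `generate()`.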

### Speech to Speech (STS)

Create conversational experiences with speech-to-speech capabilities. The unified API enables real-time voice interactions between users and AI agents.
For detailed configuration options and advanced features, check out [Speech to Speech](./speech-to-speech).

<Tabs>
  <TabItem value="openai" label="OpenAI">

```typescript
import { Agent } from "@mastra/core/agent";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";
import { OpenAIRealtimeVoice } from "@mastra/voice-openai-realtime";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new OpenAIRealtimeVoice(),
});

// Listen for agent audio responses
voiceAgent.voice.on("speaker", ({ audio }) => {
  playAudio(audio);
});

// Initiate the conversation
await voiceAgent.voice.speak("How can I help you today?");

// Send continuous audio from the microphone
const micStream = getMicrophoneStream();
await voiceAgent.voice.send(micStream);
```

Visit the [OpenAI Realtime Voice Reference](/reference/v1/voice/openai-realtime) for more information on the OpenAI Realtime voice provider.

  </TabItem>
  <TabItem value="google" label="Google">

```typescript
import { Agent } from "@mastra/core/agent";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";
import { GeminiLiveVoice } from "@mastra/voice-google-gemini-live";

const voiceAgent = new Agent({
  id: "voice-agent",
  name: "Voice Agent",
  instructions:
    "You are a voice assistant that can help users with their tasks.",
  model: "openai/gpt-5.1",
  voice: new GeminiLiveVoice({
    // Live API mode
    apiKey: process.env.GOOGLE_API_KEY,
    model: "gemini-2.0-flash-exp",
    speaker: "Puck",
    debug: true,
    // Vertex AI alternative:
    // vertexAI: true,
    // project: 'your-gcp-project',
    // location: 'us-central1',
    // serviceAccountKeyFile: '/path/to/service-account.json',
  }),
});

// Connect before using speak/send
await voiceAgent.voice.connect();

// Listen for agent audio responses
voiceAgent.voice.on("speaker", ({ audio }) => {
  playAudio(audio);
});

// Listen for text responses and transcriptions
voiceAgent.voice.on("writing", ({ text, role }) => {
  console.log(`${role}: ${text}`);
});

// Initiate the conversation
await voiceAgent.voice.speak("How can I help you today?");

// Send continuous audio from the microphone
const micStream = getMicrophoneStream();
await voiceAgent.voice.send(micStream);
```

Visit the [Google Gemini Live Reference](/reference/v1/voice/google-gemini-live) for more information on the Google Gemini Live voice provider.

  </TabItem>
</Tabs>
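
When logging `writing` events, it can help to group text by speaker rather than printing every event on its own line. The sketch below assumes events carry the `text` and `role` fields used in the handler above and that text may arrive in several fragments per turn; `appendToLog` and the `Turn` type are hypothetical names introduced for illustration.

```typescript
// Payload shape assumed from the "writing" handler above.
type WritingEvent = { text: string; role: string };
// Hypothetical record for one conversational turn.
type Turn = { role: string; text: string };

// Hypothetical helper: return a new log with the event appended,
// merging it into the previous turn when the role is unchanged.
function appendToLog(log: Turn[], event: WritingEvent): Turn[] {
  const last = log[log.length - 1];
  if (last && last.role === event.role) {
    return [...log.slice(0, -1), { role: last.role, text: last.text + event.text }];
  }
  return [...log, { role: event.role, text: event.text }];
}

// Usage sketch inside an event handler:
// let turns: Turn[] = [];
// voiceAgent.voice.on("writing", (event) => { turns = appendToLog(turns, event); });
```

Returning a new array on each call keeps the helper free of side effects, which makes it easy to test without a live connection.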

## Voice Configuration

Each voice provider can be configured with different models and options. The examples below show representative options for each supported provider; refer to each provider's reference page for the full list.

<Tabs>
  <TabItem value="openai" label="OpenAI">

```typescript
// OpenAI Voice Configuration
const voice = new OpenAIVoice({
  speechModel: {
    name: "tts-1", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    language: "en-US", // Language code
    voiceType: "neural", // Type of voice model
  },
  listeningModel: {
    name: "whisper-1", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    language: "en-US", // Language code
    format: "wav", // Audio format
  },
  speaker: "alloy", // Example speaker name
});
```

Visit the [OpenAI Voice Reference](/reference/v1/voice/openai) for more information on the OpenAI voice provider.

  </TabItem>
  <TabItem value="azure" label="Azure">

```typescript
// Azure Voice Configuration
const voice = new AzureVoice({
  speechModel: {
    name: "en-US-JennyNeural", // Example model name
    apiKey: process.env.AZURE_SPEECH_KEY,
    region: process.env.AZURE_SPEECH_REGION,
    language: "en-US", // Language code
    style: "cheerful", // Voice style
    pitch: "+0Hz", // Pitch adjustment
    rate: "1.0", // Speech rate
  },
  listeningModel: {
    name: "en-US", // Example model name
    apiKey: process.env.AZURE_SPEECH_KEY,
    region: process.env.AZURE_SPEECH_REGION,
    format: "simple", // Output format
  },
});
```

Visit the [Azure Voice Reference](/reference/v1/voice/azure) for more information on the Azure voice provider.

  </TabItem>
  <TabItem value="elevenlabs" label="ElevenLabs">

```typescript
// ElevenLabs Voice Configuration
const voice = new ElevenLabsVoice({
  speechModel: {
    voiceId: "your-voice-id", // Example voice ID
    model: "eleven_multilingual_v2", // Example model name
    apiKey: process.env.ELEVENLABS_API_KEY,
    language: "en", // Language code
    emotion: "neutral", // Emotion setting
  },
  // ElevenLabs may not have a separate listening model
});
```

Visit the [ElevenLabs Voice Reference](/reference/v1/voice/elevenlabs) for more information on the ElevenLabs voice provider.

  </TabItem>
  <TabItem value="playai" label="PlayAI">

```typescript
// PlayAI Voice Configuration
const voice = new PlayAIVoice({
  speechModel: {
    name: "playai-voice", // Example model name
    speaker: "emma", // Example speaker name
    apiKey: process.env.PLAYAI_API_KEY,
    language: "en-US", // Language code
    speed: 1.0, // Speech speed
  },
  // PlayAI may not have a separate listening model
});
```

Visit the [PlayAI Voice Reference](/reference/v1/voice/playai) for more information on the PlayAI voice provider.

  </TabItem>
  <TabItem value="google" label="Google">

```typescript
// Google Voice Configuration
const voice = new GoogleVoice({
  speechModel: {
    name: "en-US-Studio-O", // Example model name
    apiKey: process.env.GOOGLE_API_KEY,
    languageCode: "en-US", // Language code
    gender: "FEMALE", // Voice gender
    speakingRate: 1.0, // Speaking rate
  },
  listeningModel: {
    name: "en-US", // Example model name
    sampleRateHertz: 16000, // Sample rate
  },
});
```

Visit the [Google Voice Reference](/reference/v1/voice/google) for more information on the Google voice provider.

  </TabItem>
  <TabItem value="cloudflare" label="Cloudflare">

```typescript
// Cloudflare Voice Configuration
const voice = new CloudflareVoice({
  speechModel: {
    name: "cloudflare-voice", // Example model name
    accountId: process.env.CLOUDFLARE_ACCOUNT_ID,
    apiToken: process.env.CLOUDFLARE_API_TOKEN,
    language: "en-US", // Language code
    format: "mp3", // Audio format
  },
  // Cloudflare may not have a separate listening model
});
```

Visit the [Cloudflare Voice Reference](/reference/v1/voice/cloudflare) for more information on the Cloudflare voice provider.

  </TabItem>
  <TabItem value="deepgram" label="Deepgram">

```typescript
// Deepgram Voice Configuration
const voice = new DeepgramVoice({
  speechModel: {
    name: "nova-2", // Example model name
    speaker: "aura-english-us", // Example speaker name
    apiKey: process.env.DEEPGRAM_API_KEY,
    language: "en-US", // Language code
    tone: "formal", // Tone setting
  },
  listeningModel: {
    name: "nova-2", // Example model name
    format: "flac", // Audio format
  },
});
```

Visit the [Deepgram Voice Reference](/reference/v1/voice/deepgram) for more information on the Deepgram voice provider.

  </TabItem>
  <TabItem value="speechify" label="Speechify">

```typescript
// Speechify Voice Configuration
const voice = new SpeechifyVoice({
  speechModel: {
    name: "speechify-voice", // Example model name
    speaker: "matthew", // Example speaker name
    apiKey: process.env.SPEECHIFY_API_KEY,
    language: "en-US", // Language code
    speed: 1.0, // Speech speed
  },
  // Speechify may not have a separate listening model
});
```

Visit the [Speechify Voice Reference](/reference/v1/voice/speechify) for more information on the Speechify voice provider.

  </TabItem>
  <TabItem value="sarvam" label="Sarvam">

```typescript
// Sarvam Voice Configuration
const voice = new SarvamVoice({
  speechModel: {
    name: "sarvam-voice", // Example model name
    apiKey: process.env.SARVAM_API_KEY,
    language: "en-IN", // Language code
    style: "conversational", // Style setting
  },
  // Sarvam may not have a separate listening model
});
```

Visit the [Sarvam Voice Reference](/reference/v1/voice/sarvam) for more information on the Sarvam voice provider.

  </TabItem>
  <TabItem value="murf" label="Murf">

```typescript
// Murf Voice Configuration
const voice = new MurfVoice({
  speechModel: {
    name: "murf-voice", // Example model name
    apiKey: process.env.MURF_API_KEY,
    language: "en-US", // Language code
    emotion: "happy", // Emotion setting
  },
  // Murf may not have a separate listening model
});
```

Visit the [Murf Voice Reference](/reference/v1/voice/murf) for more information on the Murf voice provider.

  </TabItem>
  <TabItem value="openai-realtime" label="OpenAI Realtime">

```typescript
// OpenAI Realtime Voice Configuration
const voice = new OpenAIRealtimeVoice({
  speechModel: {
    name: "gpt-4o-realtime-preview", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    language: "en-US", // Language code
  },
  listeningModel: {
    name: "whisper-1", // Example model name
    apiKey: process.env.OPENAI_API_KEY,
    format: "ogg", // Audio format
  },
  speaker: "alloy", // Example speaker name
});
```

For more information on the OpenAI Realtime voice provider, refer to the [OpenAI Realtime Voice Reference](/reference/v1/voice/openai-realtime).

  </TabItem>
  <TabItem value="google-gemini-live" label="Google Gemini Live">

```typescript
// Google Gemini Live Voice Configuration
const voice = new GeminiLiveVoice({
  speechModel: {
    name: "gemini-2.0-flash-exp", // Example model name
    apiKey: process.env.GOOGLE_API_KEY,
  },
  speaker: "Puck", // Example speaker name
  // Google Gemini Live is a realtime bidirectional API without separate speech and listening models
});
```

Visit the [Google Gemini Live Reference](/reference/v1/voice/google-gemini-live) for more information on the Google Gemini Live voice provider.

  </TabItem>
  <TabItem value="aisdk" label="AI SDK">

```typescript
// AI SDK Voice Configuration
import { Agent } from "@mastra/core/agent";
import { CompositeVoice } from "@mastra/core/voice";
import { openai } from "@ai-sdk/openai";
import { elevenlabs } from "@ai-sdk/elevenlabs";

// Use AI SDK models directly - no separate Mastra voice provider packages needed
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'),      // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'),  // AI SDK speech
});

// Works seamlessly with your agent
const voiceAgent = new Agent({
  id: "aisdk-voice-agent",
  name: "AI SDK Voice Agent",
  instructions: "You are a helpful assistant with voice capabilities.",
  model: openai("gpt-5.1"),
  voice,
});
```
  </TabItem>
</Tabs>
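
Every configuration above reads credentials from `process.env`, so a missing variable only surfaces later as a provider error. One way to fail earlier is a small guard like the sketch below; `requireEnv` is a hypothetical helper, not something Mastra provides.

```typescript
// Hypothetical helper (not a Mastra API): read a required environment
// variable, throwing a descriptive error if it is unset or blank.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage sketch with one of the configurations above:
// const voice = new OpenAIVoice({
//   speechModel: { apiKey: requireEnv("OPENAI_API_KEY") },
// });
```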

### Using Multiple Voice Providers

This example demonstrates how to create and use two different voice providers in Mastra: OpenAI for speech-to-text (STT) and PlayAI for text-to-speech (TTS).

Start by creating instances of the voice providers with any necessary configuration.

```typescript
import { OpenAIVoice } from "@mastra/voice-openai";
import { PlayAIVoice } from "@mastra/voice-playai";
import { CompositeVoice } from "@mastra/core/voice";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";

// Initialize OpenAI voice for STT
const input = new OpenAIVoice({
  listeningModel: {
    name: "whisper-1",
    apiKey: process.env.OPENAI_API_KEY,
  },
});

// Initialize PlayAI voice for TTS
const output = new PlayAIVoice({
  speechModel: {
    name: "playai-voice",
    apiKey: process.env.PLAYAI_API_KEY,
  },
});

// Combine the providers using CompositeVoice
const voice = new CompositeVoice({
  input,
  output,
});

// Capture audio from the microphone and transcribe it with the input provider
const audioStream = getMicrophoneStream();
const transcript = await voice.listen(audioStream);

// Log the transcribed text
console.log("Transcribed text:", transcript);

// Convert text to speech
const responseAudio = await voice.speak(`You said: ${transcript}`, {
  speaker: "default", // Optional: specify a speaker
  responseFormat: "wav", // Optional: specify a response format
});

// Play the audio response
playAudio(responseAudio);
```
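
To make the routing idea concrete, here is a deliberately simplified sketch of composition: one object that forwards `listen()` to an input provider and `speak()` to an output provider. It is not how `CompositeVoice` is implemented; the `SpeechInput` and `SpeechOutput` interfaces and the `RoutedVoice` class are hypothetical and exist only for illustration.

```typescript
import { Readable } from "node:stream";

// Hypothetical minimal interfaces, not the real Mastra types.
interface SpeechInput {
  listen(audio: Readable): Promise<string>;
}
interface SpeechOutput {
  speak(text: string): Promise<Readable>;
}

// Illustrative stand-in for the composition pattern:
// each call is delegated to exactly one underlying provider.
class RoutedVoice implements SpeechInput, SpeechOutput {
  private readonly input: SpeechInput;
  private readonly output: SpeechOutput;

  constructor(input: SpeechInput, output: SpeechOutput) {
    this.input = input;
    this.output = output;
  }

  // Speech-to-text goes to the input provider.
  listen(audio: Readable): Promise<string> {
    return this.input.listen(audio);
  }

  // Text-to-speech goes to the output provider.
  speak(text: string): Promise<Readable> {
    return this.output.speak(text);
  }
}
```

Because callers only see `listen()` and `speak()`, either provider can be swapped without touching the code that uses the combined voice.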

### Using AI SDK Model Providers

You can also use AI SDK models directly with `CompositeVoice`:

```typescript
import { CompositeVoice } from "@mastra/core/voice";
import { openai } from "@ai-sdk/openai";
import { elevenlabs } from "@ai-sdk/elevenlabs";
import { playAudio, getMicrophoneStream } from "@mastra/node-audio";

// Use AI SDK models directly - no provider setup needed
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'),      // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'),  // AI SDK speech
});

// Works the same way as Mastra providers
const audioStream = getMicrophoneStream();
const transcript = await voice.listen(audioStream);

console.log("Transcribed text:", transcript);

// Convert text to speech
const responseAudio = await voice.speak(`You said: ${transcript}`, {
  speaker: "Rachel", // ElevenLabs voice
});

playAudio(responseAudio);
```

You can also mix AI SDK models with Mastra providers:

```typescript
import { CompositeVoice } from "@mastra/core/voice";
import { PlayAIVoice } from "@mastra/voice-playai";
import { groq } from "@ai-sdk/groq";

const voice = new CompositeVoice({
  input: groq.transcription('whisper-large-v3'),  // AI SDK for STT
  output: new PlayAIVoice(),                       // Mastra provider for TTS
});
```

For more information on `CompositeVoice`, refer to the [CompositeVoice Reference](/reference/v1/voice/composite-voice).

## More Resources

- [CompositeVoice](/reference/v1/voice/composite-voice)
- [MastraVoice](/reference/v1/voice/mastra-voice)
- [OpenAI Voice](/reference/v1/voice/openai)
- [OpenAI Realtime Voice](/reference/v1/voice/openai-realtime)
- [Azure Voice](/reference/v1/voice/azure)
- [Google Voice](/reference/v1/voice/google)
- [Google Gemini Live Voice](/reference/v1/voice/google-gemini-live)
- [Deepgram Voice](/reference/v1/voice/deepgram)
- [PlayAI Voice](/reference/v1/voice/playai)
- [Voice Examples](/examples/v1/voice/text-to-speech)
